Welcome to SciClaimEval, a pilot task on verifying scientific claims against tables and figures in scientific articles. The task is organized as part of NTCIR-19 and evaluates systems that can reliably check the truthfulness of scientific statements using multi-modal evidence.

Scientific claim verification involves determining whether claims made in research papers are supported or refuted by accompanying evidence, such as experimental results, tables, and figures.
With the rapid rise of generative AI and large language models (LLMs), the volume of scientific submissions has increased substantially, creating a growing demand for tools that can assist reviewers in assessing the validity and consistency of paper claims.

The SciClaimEval pilot task focuses on cross-modal scientific claim verification, aiming to assess whether textual claims in scientific papers are adequately supported by evidence from diverse modalities, namely tables and figures.
We introduce a new benchmark dataset, constructed by extracting claims and their corresponding evidence from scientific articles across multiple domains, including biomedicine, machine learning, and natural language processing.
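As a rough illustration of what a verification instance might look like, the sketch below pairs a textual claim with evidence items and a verdict label. This is a minimal, hypothetical data model: the field names, the label set (supported / refuted / not enough info), and the toy example are our assumptions, not the official SciClaimEval data format.

    from dataclasses import dataclass
    from enum import Enum
    from typing import List


    class Verdict(Enum):
        # Hypothetical label set; the official SciClaimEval
        # labels may differ.
        SUPPORTED = "supported"
        REFUTED = "refuted"
        NOT_ENOUGH_INFO = "not_enough_info"


    @dataclass
    class Evidence:
        # An evidence item drawn from the source paper:
        # a table (serialized cells) or a figure (image file).
        modality: str    # "table" or "figure"
        reference: str   # e.g. "Table 2" or "Figure 3a"
        content: str     # serialized table text or an image path


    @dataclass
    class ClaimInstance:
        claim_id: str
        claim: str                # the textual claim to verify
        evidence: List[Evidence]  # cross-modal evidence items
        label: Verdict            # gold verdict for the claim


    # A toy instance, purely for illustration (not real data):
    example = ClaimInstance(
        claim_id="demo-0001",
        claim="Model A outperforms Model B in accuracy.",
        evidence=[Evidence(modality="table",
                           reference="Table 2",
                           content="Model A: 91.2 | Model B: 88.7")],
        label=Verdict.SUPPORTED,
    )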

News

To be announced.

Important Dates

  • January 2026 - Dataset release
  • January-June 2026 - Dry run
  • March-July 2026 - Formal run
  • August 1, 2026 - Evaluation results returned to participants
  • August 1, 2026 - Task overview release
  • September 1, 2026 - Participant paper submission due
  • December 8-10, 2026 - NTCIR-19 conference

Organizers